Local Topological Signatures for Network-Based Prediction of Biological Function
نویسندگان
چکیده
In biology, similarity in structure or sequence between molecules is often used as evidence of functional similarity. In protein interaction networks, structural similarity of nodes (i.e., proteins) is often captured by comparing node signatures (vectors of topological properties of neighborhoods surrounding the nodes). In this paper, we ask how well such topological signatures predict protein function, using protein interaction networks of the organism Saccharomyces cerevisiae. To this end, we compare two node signatures from the literature – the graphlet degree vector and a signature based on the graph spectrum – and our own simple node signature based on basic topological properties. We find the connection between topology and protein function to be weak but statistically significant. Surprisingly, our node signature, despite its simplicity, performs on par with the other more sophisticated node signatures. In fact, we show that just two metrics, the link count and transitivity, are enough to classify protein function at a level on par with the other signatures suggesting that detailed topological characteristics are unlikely to aid in protein function prediction based on protein interaction networks.
منابع مشابه
Link Prediction using Network Embedding based on Global Similarity
Background: The link prediction issue is one of the most widely used problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. The link prediction local approaches with node structure objectives are fast in case of speed but are not accurate enough. On the other hand, the global link predicti...
متن کاملExploring Gene Signatures in Different Molecular Subtypes of Gastric Cancer (MSS/ TP53+, MSS/TP53-): A Network-based and Machine Learning Approach
Gastric cancer (GC) is one of the leading causes of cancer mortality, worldwide. Molecular understanding of GC’s different subtypes is still dismal and it is necessary to develop new subtype-specific diagnostic and therapeutic approaches. Therefore developing comprehensive research in this area is demanding to have a deeper insight into molecular processes, underlying these subtypes. In this st...
متن کاملCommon neighbours and the local-community-paradigm for topological link prediction in bipartite networks
Bipartite networks are powerful descriptions of complex systems characterized by two different classes of nodes and connections allowed only across but notwithin the two classes. Unveiling physical principles, building theories and suggesting physicalmodels to predict bipartite links such as productconsumer connections in recommendation systems or drug–target interactions inmolecular networks c...
متن کاملAn application of topological graph clustering to protein function prediction
We use a semisupervised learning algorithm based on a topological data analysis approach to assign functional categories to yeast proteins using similarity graphs. This new approach to analyzing biological networks yields results that are as good as or better than state of the art existing approaches.
متن کاملDistributed Generation Effects on Unbalanced Distribution Network Losses Considering Cost and Security Indices
Due to the increasing interest on renewable sources in recent years, the studies on integration of distributed generation to the power grid have rapidly increased. In order to minimize line losses of power systems, it is crucially important to define the size and location of local generation to be placed. Minimizing the losses in the system would bring two types of saving, in real life, one is ...
متن کامل